-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Rewrite graph-cli in Rust #6282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
0f39307 to
021d897
Compare
fbf5d36 to
64a077d
Compare
526dc48 to
5c086e6
Compare
|
Couple of things I noticed for the commands that i'm familiar with:
|
8e5442c to
1728763
Compare
2d2a5e8 to
39e7a64
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Differences and regressions I noticed compared to the graph-cli when running init interactively with the Smart Contract source
-
Step ordering
Some prompts are ordered differently (for example, subgraph slug and directory).
I actually think thegndorder makes more sense here, since it may allow suggestions based on the fetched contract information. -
Contract info fetching (regression)
Ingraph-cli, contract information is fetched immediately after the contract address is entered. The CLI then suggests the fetched contract name, start block, and would prompt for ABI path only if fetching fails.
Ingnd, these values are prompted before fetching the contract info, and later, it appears that the fetched data is not used and the defaults or user input are kept instead.
(I’ve added code snippets below showing the step order in both CLIs.) -
Adding additional contracts
graph-cliprompts the user at the end ofinittoaddanother contract, while gnd does not.
graph-cli:
$ graph init
✔ Network · Arbitrum One Mainnet · arbitrum-one · https://arbiscan.io
✔ Source · Smart Contract · ethereum
✔ Subgraph slug · graph-token
✔ Directory to create the subgraph in · graph-token
✔ Contract address · 0x9623063377AD1B27544C965cCd7342f7EA7e88C7
✔ Fetching ABI from Sourcify API...
✔ Fetching ABI from Contract API...
✔ Fetching start block from Contract API...
✔ Fetching contract name from Contract API...
✔ Start block · 42449274
✔ Contract name · GraphProxy
✔ Index contract events as entities (Y/n) · false
Generate subgraph
Write subgraph to directory
✔ Create subgraph scaffold
✔ Initialize networks config
✔ Initialize subgraph repository
✔ Install dependencies with yarn
✔ Generate ABI and schema types with yarn codegen
✔ Add another contract? (y/N) · false
Subgraph graph-token created in graph-tokengnd:
$ gnd init
Creating a new subgraph...
> Source: Smart contract
> Network: Arbitrum One Mainnet (arbitrum-one)
> Contract address: 0x9623063377AD1B27544C965cCd7342f7EA7e88C7
> Contract name: Contract
> Subgraph slug: contract
> Directory: contract
> Do you have an ABI file? No
> Start block: 0
> Index contract events as entities? No
→ Fetching contract info from 0x9623063377AD1B27544C965cCd7342f7EA7e88C7 on arbitrum-one
✔ Found contract: GraphProxy
→ Generating scaffold files
→ Initializing Git repository
✔ Install dependencies with pnpm
✔ Subgraph created at contract
Next steps:
cd contract
gnd codegen
gnd build-
Directory name collisions
If the target directory already exists,graph-cliasks whether it should be overwritten.
gndexits with an error instead. While this may not strictly be a regression, I think it would make more sense to prompt the user to choose a different directory name to avoid the collision. -
Obsolete
spkgflag
Not a regression, but gnd still exposes the spkg flag, which appears to be obsolete now that SPS is deprecated and the Substreams option has been removed. -
gnddoes not create thenetworks.jsonfile with the chain info, that is used for deploying the subgraph to multiple networks
{
"arbitrum-one": {
"GraphProxy": {
"address": "0x9623063377AD1B27544C965cCd7342f7EA7e88C7",
"startBlock": 42449274
}
}
}I’ll follow up with additional comments if I find any more differences or regressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two more things i noticed for init:
gndwill add the contract address as a hex number in the manifest, instead of as a string
# gnd
source:
abi: Graph
address: 0x9623063377AD1B27544C965cCd7342f7EA7e88C7
# graph-cli
source:
address: "0x9623063377AD1B27544C965cCd7342f7EA7e88C7"gnd, when scaffolding the manifest, will add the mappings file without prependign./srcso it will only contain the mappings file name. https://github.com/graphprotocol/graph-node/blob/lutter/gnd-cli/gnd/src/scaffold/manifest.rs#L9
Theaddcommand on the other hand, correctly prependssrc
# gnd
...
file: ./graph.ts
...
# graph-cli
...
file: ./src/graph-proxy.ts
...So graph-cli won't be able to work with gnd generated subgraphs:
✖ Failed to load subgraph from subgraph.yaml: Error in subgraph.yaml:
Path: dataSources > 0 > source > address
Expected string, found number:
8.571296818927177e+47
Path: dataSources > 0 > mapping > file
File does not exist: graph.ts
Path: dataSources > 1 > source > address
Expected string, found number:
1.0021244402728634e+48Also gnd build will fail, as it expects the path to contain ./src
Manifest validation errors:
- Mapping file for data source 'Graph' not found: ./graph.tsThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the add command (in addition to what is mentioned here #6282 (comment))
- In
gndtheaddcommand accepts a--networkflag, which in the case ofadddoes not make sense, as subgraphs do not support multichain datasources, so the chain should always be derived from the manifest file. I guess Claude got confused here and added--networkinstead of--network-file, which should be the path to the networks config file (if a custom one is used). - Similar to
init,adddoes not update thenetworksfile and this functionality has been ignored completely.
Other than that seems to be working fine.
| r#" | ||
| let id = this.get('id') | ||
| if (id == null || id.kind == ValueKind.NULL) {{ | ||
| store.set('{}', 'auto', this) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure where this comes from. I'm not familiar with timeseries and aggregation entities, which seem to have auto incremented Int8 ids (at least according to the docs)
gnd codegen will produce this for a regular entity from the schema. The comment before the constructor seems incorrect, at least for regular entities. Also I don't think regular entities with and id of type ID or Bytes, should be auto-incremented. Also (2) if (id !== null) <--- this won't work with AS < 0.20.0, before === and == were unified https://github.com/AssemblyScript/assemblyscript/releases/tag/v0.20.0
export class ExampleEntity extends Entity {
/**
* Leaving out the id argument uses an autoincrementing id.
*/
constructor(id: Bytes | null = null) {
super();
if (id !== null) {
this.set("id", Value.fromBytes(id));
}
}
save(): void {
let id = this.get("id");
if (id == null || id.kind == ValueKind.NULL) {
store.set("ExampleEntity", "auto", this);
} else {
assert(
id.kind == ValueKind.BYTES,
`Entities of type ExampleEntity must have an ID of type Bytes but the id '${id.displayData()}' is of type ${id.displayKind()}`,
);
store.set("ExampleEntity", id.toBytes().toHexString(), this);
}
}Also this is what gnd generates for a timeseries Entity
export class Data extends Entity {
/**
* Leaving out the id argument uses an autoincrementing id.
*/
constructor(id: i64 = i64.MIN_VALUE) {
super();
if (id != i64.MIN_VALUE) {
this.set("id", Value.fromI64(id));
}
}
save(): void {
let id = this.get("id");
if (id == null || id.kind == ValueKind.NULL) {
store.set("Data", "auto", this);
} else {
assert(
id.kind == ValueKind.INT8,
`Entities of type Data must have an ID of type Int8 but the id '${id.displayData()}' is of type ${id.displayKind()}`,
);
store.set("Data", id.toI64().toString(), this);
}
}
...
}and this is from graph-cli v0.98.1:
export class Data extends Entity {
constructor(id: Int8) {
super();
this.set("id", Value.fromI64(id));
}
save(): void {
let id = this.get("id");
assert(id != null, "Cannot save Data entity without an ID");
if (id) {
assert(
id.kind == ValueKind.INT8,
`Entities of type Data must have an ID of type Int8 but the id '${id.displayData()}' is of type ${id.displayKind()}`,
);
store.set("Data", id.toI64().toString(), this);
}
}
...
}I'm a little bit confused here :D Both are specVersion: 1.3.0 and apiVersion: 0.0.9
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A long time ago, these generated constructors took strings; since PR #5029, graph-node accepts "auto" for these constructors if the type of the id is Int8 or Bytes to mean "generate an id". This was meant as a general facility to keep people from generating insanely long ids when they don't really need them/don't care about them.
At some point, graph-cli was changed so that these constructors took Int8 and Bytes, making it impossible to use "auto". This updated codegen tries to make that possible again.
As for the id !== null comparison, that seems to work. I am attaching a little example that demonstrates that. When I run that all the asserts are successful. But I might be missing some big footgun
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah i'm not completelly sure about == vs === either, what the exact conditions/cases are. I remember people had issues when loading an enitity and comparing with null using ===, or comparing values coming from an event against some pre-defined values. One case that I could reproduce is:
test("=== vs ==", () => {
let helloWorld = "hello world";
let hello = "hello";
let hello_2 = helloWorld.slice(0, 5);
log.info("Test ===", []);
if (hello_2 === hello) {
log.info("Strings are equal with ===: {}", [hello_2]);
} else {
log.info("Strings are not equal with ===: {}", [hello_2]);
}
log.info("Test ==", []);
if (hello_2 == hello) {
log.info("Strings are equal with ==: {}", [hello_2]);
} else {
log.info("Strings are not equal with ==: {}", [hello_2]);
}
});and the result is
Compiling...
💬 Compiling equality...
Igniting tests 🔥
equality
--------------------------------------------------
√ === vs == - 0.029ms
💬 Test ===
💬 Strings are not equal with ===: hello
💬 Test ==
💬 Strings are equal with ==: helloThe init command's interactive mode now fetches contract information (ABI, name, start block) from Etherscan/Sourcify immediately after the user provides the network and contract address, before prompting for remaining values. This allows fetched values to be used as defaults: - Contract name defaults to the fetched name - Start block defaults to the fetched deployment block - ABI prompt is skipped if fetch was successful Addresses PR #6282 review feedback issue I1.
The --spkg flag was for Substreams support which is not implemented in the init command. Removing the unused flag to avoid confusion. Addresses PR #6282 review feedback issue I4.
The init command's interactive mode now fetches contract information (ABI, name, start block) from Etherscan/Sourcify immediately after the user provides the network and contract address, before prompting for remaining values. This allows fetched values to be used as defaults: - Contract name defaults to the fetched name - Start block defaults to the fetched deployment block - ABI prompt is skipped if fetch was successful Addresses PR #6282 review feedback issue I1.
The --spkg flag was for Substreams support which is not implemented in the init command. Removing the unused flag to avoid confusion. Addresses PR #6282 review feedback issue I4.
c3c2e64 to
4194a74
Compare
The init command's interactive mode now fetches contract information (ABI, name, start block) from Etherscan/Sourcify immediately after the user provides the network and contract address, before prompting for remaining values. This allows fetched values to be used as defaults: - Contract name defaults to the fetched name - Start block defaults to the fetched deployment block - ABI prompt is skipped if fetch was successful Addresses PR #6282 review feedback issue I1.
The --spkg flag was for Substreams support which is not implemented in the init command. Removing the unused flag to avoid confusion. Addresses PR #6282 review feedback issue I4.
The --spkg flag was for Substreams support which is not implemented in the init command. Removing the unused flag to avoid confusion. Addresses PR #6282 review feedback issue I4.
The test was failing because it expected `gnd build` to automatically run codegen when generated files don't exist. While the spec says build should "Run codegen (unless types already exist)", this feature isn't implemented yet. Fix the test to explicitly run `gnd codegen` before `gnd build`, which matches what users would typically do and makes the test more explicit about what it's testing.
After creating a subgraph from a contract, gnd now prompts "Add another contract from <network>?" in interactive mode. This allows users to add multiple contracts in a single session, similar to graph-cli behavior. The flow: 1. User completes init --from-contract 2. gnd asks if they want to add another contract 3. If yes, prompts for address, name, start block 4. Runs the add command internally with merge_entities=true 5. Repeats until user says no Implements I2 from the PR review feedback plan.
When the target directory already exists during `gnd init`, instead of failing with an error, the user is now prompted to enter an alternative directory name. This applies to all init modes: --from-contract, --from-example, and --from-subgraph. The new `resolve_directory_collision()` function in prompt.rs handles the prompting loop, continuing until a non-existent directory is chosen.
When running `gnd init` interactively with a contract address, the contract info was fetched twice: once in InitForm::run_interactive() to get defaults for prompts, and again in init_from_contract() to generate the scaffold. Store the fetched ContractInfo in InitForm and pass it through to init_from_contract() to skip the redundant network request.
Consolidate duplicated code across build and codegen commands: - Create manifest.rs with unified Manifest types and load_manifest() - Create watch.rs with generic watch_and_run() for --watch mode - Consolidate compile_data_source_mapping/compile_template_mapping - Consolidate copy_abi/copy_template_abi into copy_abi_to_dir - Add output_extension() helper for wasm/wast conditionals Removes ~225 lines of duplicated code while improving maintainability.
Make `gnd init` generate scaffold files that match `graph-cli init` output for better compatibility when --index-events is not set (placeholder mode). Changes: - manifest: Put address before abi in source, use event names for entities (not ExampleEntity), remove blank lines between event handlers - schema: Use @entity(immutable: true), include first 2 event params with type comments, remove block fields from example entity - mapping: Import contract class and all events, generate handlers for all events (first with full code, rest empty stubs), use correct ID pattern with Bytes.fromByteArray, add extended comments with callable functions
When manifest_path is a bare filename without directory component,
Path::parent() returns Some("") (empty string), not None. The old code
only handled None with unwrap_or_else, causing canonicalize() to fail.
Add manifest_dir() helper that filters empty paths and use it
consistently across build, add, and migrations commands.
Extract RESERVED_WORDS, handle_reserved_word(), and capitalize() into a shared module to reduce code duplication across codegen. Previously these utilities were duplicated in: - codegen/schema.rs (RESERVED_WORDS with 47 items, handle_reserved_word) - codegen/abi.rs (RESERVED_WORDS with 43 items, handle_reserved_word, capitalize) The shared module uses a unified RESERVED_WORDS list that includes all words from both lists (including "await" from abi.rs).
…formats
Extract normalize_abi_json() to a shared abi module and use it in
copy_abi_to_dir (build.rs) so that Hardhat artifacts ({"abi":[...]})
and Truffle artifacts ({"compilerOutput":{"abi":[...]}}) are stripped
down to the bare ABI array before being written to the build output.
Previously, a raw fs::copy would pass through the metadata wrapper,
causing "expected a valid JSON ABI sequence" failures.
Add a quiet flag to AddOpt so that run_add skips printing "Next steps" when called from the add-more-contracts loop inside init. The message now prints exactly once, at the end of init_from_contract.
Call run_codegen directly after dependency installation so that init --from-contract produces a ready-to-build subgraph, matching the existing behavior of init --from-example.
Entity reference fields (e.g., `token: Token!`) were hardcoded to use String for getters/setters regardless of the referenced entity's actual ID type. Now looks up the referenced entity's IdFieldKind so that e.g. a reference to an entity with `id: Bytes!` correctly generates `toBytes()`/`fromBytes()` instead of `toString()`/`fromString()`. Also fixes derived field getters on entities with Bytes IDs to use `toBytes().toHexString()` for the loader constructor argument.
Replace gnd's lenient manual JSON/YAML parsing with graph-node's `UnresolvedSubgraphManifest<Chain>::parse()`. This uses the same serde types as graph-node itself, so required fields like `source.abi` and `mapping.abis` are now enforced at parse time. Changes: - Add `graph-chain-ethereum` as a direct gnd dependency - Rewrite `load_manifest()` to use `UnresolvedSubgraphManifest::parse()` - Convert parsed graph-node types into gnd's existing Manifest/DataSource /Template/Abi types to minimize caller changes - Re-export unresolved Ethereum types from graph-chain-ethereum for future validation steps - Add `UnresolvedSource::address()` accessor on subgraph data sources - Add tests for missing `source.abi` and missing `mapping.abis` rejection - Update test manifests to include all required fields
Add file-level manifest validation to codegen so that missing schema files, mapping files, and ABI files are caught early during `gnd codegen` rather than only during `gnd build`. Introduce `validate_manifest_files()` which checks file existence and ABI validity without enforcing deployment-level concerns (network names, spec versions) that are irrelevant during code generation.
…lers) Add three new validation checks that run during both codegen and build: 1. source.abi cross-reference: Verify that each Ethereum data source's source.abi references an ABI name that exists in mapping.abis. 2. Unique names: Check that data source names and template names are each unique within their category. 3. Handler presence: Ensure Ethereum data sources and templates define at least one event, call, or block handler. To support these checks, extend the DataSource and Template structs with source_abi, event_handlers, call_handlers, and block_handlers fields, populated from graph-node's parsed types during manifest conversion.
…andlers) Port Ethereum-specific validation from graph-node's DataSource::validate(): - Validate data source kind is 'ethereum' or 'ethereum/contract' - Require source address when call or block handlers are present - Enforce block handler constraints (no duplicates per filter type, no mixing non-filtered with polling/once) - Validate receipt requires apiVersion >= 0.0.7 - Validate eth call declarations require specVersion >= 1.2.0 To support this, enrich gnd's simplified types: event_handlers now carry receipt and call-decl flags, block_handlers carry filter kind. These structural validations run during build (validate_manifest) but not during codegen to avoid blocking code generation for manifests with semantic issues. Also fix codegen snapshot tests to use valid manifests (ABIs listed, handlers defined) per validation added in previous commits.
After compiling AssemblyScript mappings to WASM, parse the compiled binary with wasmparser and verify that all handler names declared in the manifest (event, call, and block handlers) exist as exported functions. This catches mismatches between the manifest and the actual WASM code early, during `gnd build`, rather than at deployment time.
Add validation that event handler event signatures and call handler function signatures match entries in the source ABI. This catches errors like referencing a nonexistent event or using wrong parameter types early, during codegen rather than at deploy time. The matching logic mirrors graph-node's runtime matching: - Event signatures support both indexed and non-indexed forms - Non-indexed fallback only matches if a single event variant exists - Function signatures use canonical form: name(type1,type2,...) Changes: - Add `event` field to EventHandler (Solidity event signature) - Replace `call_handlers: Vec<String>` with `Vec<CallHandler>` struct containing both the function signature and handler name - Add validate_handler_signatures() wired into both codegen and build - Update test ABI fixtures to include actual events/functions
a43f909 to
8d75cdf
Compare
Replace ethabi types with alloy equivalents in gnd's validation module: - Add Param and EventParam re-exports to graph::abi - Use JsonAbi instead of ethabi::Contract for ABI loading - Use Event::signature() and Function::signature() from alloy - Replace manual ethabi_event_signature/ethabi_function_signature helpers - Keep event_signature_with_indexed() using EventParam::selector_type() This is the first step toward removing the ethabi dependency from gnd.
Replace ethabi with alloy (via graph::abi) for ABI code generation, eliminating the redundant ethabi dependency. Key improvements: - Use JsonAbi/DynSolType instead of ethabi Contract/ParamType - Remove ComponentNames workaround (~130 lines) since alloy's Param.components preserves tuple component names natively - Simplify AbiCodeGenerator to just (contract, name) fields - Use FunctionExt::signature_compat() for function signatures - Simplify ABI preprocessing (alloy handles most serde defaults)
1ad777f to
a6908ee
Compare
This is all Claude-generated code, I haven't even looked at it or tried it. It's mostly here for informational purposes